Abstract

This paper presents a proof of correctness of an iterative approximate Byzantine
consensus (IABC) algorithm for directed graphs. The iterative algorithm allows
fault-free nodes to reach approximate consensus despite the presence of up to $f$ Byzantine
faults. Necessary conditions on the underlying network graph for the existence of a
correct IABC algorithm were shown in our recent work [15, 16]. [15] also analyzed a
specific IABC algorithm and showed that it performs correctly in any network graph
that satisfies the necessary condition, proving that the necessary condition is also
sufficient. In this paper, we present an alternate proof of correctness of the IABC algorithm,
using a familiar technique based on transition matrices [9, 3, 17, 19].

The  key  contribution  of  this  paper  is  to  exploit  the  following  observation:  for  a 
given  evolution  of  the  state  vector  corresponding  to  the  state  of  the  fault-free  nodes, 
many alternate state transition matrices may be chosen to model that evolution
correctly. For a given state evolution, we identify one approach to suitably “design” the
transition  matrices  so  that  the  standard  tools  for  proving  convergence  can  be  applied 
to  the  Byzantine  fault-tolerant  algorithm  as  well.  In  particular,  the  transition  matrix 
for  each  iteration  is  designed  such  that  each  row  of  the  matrix  contains  a  large  enough 
number  of  elements  that  are  bounded  away  from  0. 
expressed here are those of the authors and do not necessarily reflect the views of the funding agencies or
the  U.S.  government. 
1  Introduction


Dolev  et  al.  [5]  introduced  the  notion  of  approximate  Byzantine  consensus  by  relaxing 
the  requirement  of  exact  consensus  [14].  The  goal  in  approximate  consensus  is  to  allow 
the  fault-free  nodes  to  agree  on  values  that  are  approximately  equal  to  each  other  (and 
not necessarily exactly identical). In the presence of Byzantine faults, while exact consensus is
impossible in asynchronous systems [7], approximate consensus is achievable [5]. The notion
of approximate consensus is of interest in synchronous systems as well, since approximate
consensus can be achieved using simple distributed algorithms that do not require complete
knowledge of the network topology [4].

In this paper, we are interested in iterative algorithms for achieving approximate Byzantine
consensus in synchronous point-to-point networks that are modeled by arbitrary directed
graphs.  The  iterative  approximate  Byzantine  consensus  (IABC)  algorithms  of  interest  have 
the  following  properties,  which  we  will  soon  state  more  formally: 

•  Initial  state  of  each  node  is  equal  to  a  real-valued  input  provided  to  that  node. 

•  Validity condition: After each iteration of an IABC algorithm, the state of each
fault-free node must remain in the convex hull of the states of the fault-free nodes at the
end of the previous iteration.

•  Convergence condition: For any $\epsilon > 0$, after a sufficiently large number of iterations,
the states of the fault-free nodes are guaranteed to be within $\epsilon$ of each other.

Certain  IABC  algorithms  have  been  shown  to  satisfy  the  above  properties  in  fully  connected 
graphs  [5,  14],  and  in  arbitrary  directed  graphs  satisfying  a  tight  necessary  condition  [15,  16]. 
Please  refer  to  [15,  16]  for  a  summary  of  the  related  work. 

The main contribution of this paper is to develop an alternate proof of correctness for an
IABC algorithm, which was proved correct in arbitrary graphs that satisfy a necessary
condition developed in our prior work [15]. The alternate proof is based on transition matrices
that capture the behavior of the IABC algorithm executed by the fault-free nodes. This
work is inspired by, and borrows some matrix analysis tools from, other work that also uses
transition matrices in related contexts [9, 3, 17, 19]. This paper exploits the following
observation: for a given evolution of the state vector corresponding to the state of the fault-free
nodes,  many  alternate  state  transition  matrices  may  potentially  be  chosen  to  emulate  that 
evolution  correctly.  For  a  given  state  evolution,  we  identify  one  approach  to  suitably  “design” 
the  transition  matrices  so  that  the  standard  tools  can  be  applied  to  prove  convergence  of  the 
Byzantine  fault-tolerant  algorithm  in  all  networks  that  satisfy  a  necessary  condition  (proved 
in  [16])  on  the  network  communication  graph.  In  particular,  the  transition  matrix  for  each 
iteration  is  designed  such  that  each  row  of  the  matrix  contains  a  large  enough  number  of 
elements  that  are  bounded  away  from  0. 


2  Network  and  Failure  Models 


Network Model: The system is assumed to be synchronous. The communication network
is modeled as a simple directed graph $G(\mathcal{V}, \mathcal{E})$, where $\mathcal{V} = \{1, \dots, n\}$ is the set of $n$ nodes,
and $\mathcal{E}$ is the set of directed edges between the nodes in $\mathcal{V}$. Node $i$ can reliably transmit
messages to node $j$ if and only if the directed edge $(i, j)$ is in $\mathcal{E}$. Each node can send
messages to itself as well; however, for convenience, we exclude self-loops from set $\mathcal{E}$. That
is, $(i, i) \notin \mathcal{E}$ for $i \in \mathcal{V}$. With a slight abuse of terminology, we will use the terms edge and
link interchangeably in our presentation.

For each node $i$, let $N_i^-$ be the set of nodes from which $i$ has incoming edges. That
is, $N_i^- = \{ j \mid (j, i) \in \mathcal{E} \}$. Similarly, define $N_i^+$ as the set of nodes to which node $i$ has
outgoing edges. That is, $N_i^+ = \{ j \mid (i, j) \in \mathcal{E} \}$. Since we exclude self-loops from $\mathcal{E}$, $i \notin N_i^-$
and $i \notin N_i^+$. However, we note again that each node can indeed send messages to itself. A
necessary condition for correctness of an IABC algorithm for $f > 0$ is that $|N_i^-| > 2f$ [15].

Node $j$ is said to be an incoming neighbor of node $i$ if $j \in N_i^-$. Similarly, $j$ is said to be
an outgoing neighbor of node $i$ if $j \in N_i^+$.
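As an illustration of the neighbor sets just defined, the following minimal Python sketch builds the incoming and outgoing sets from an edge list and checks the in-degree bound. The function names `neighbor_sets` and `meets_degree_bound` are hypothetical helpers introduced here, not part of the paper.

```python
def neighbor_sets(n, edges):
    """Return (incoming, outgoing) neighbor sets for nodes 1..n.

    edges is an iterable of directed pairs (i, j), meaning i can send to j.
    Self-loops are dropped, matching the convention that (i, i) is not in E.
    """
    incoming = {i: set() for i in range(1, n + 1)}
    outgoing = {i: set() for i in range(1, n + 1)}
    for i, j in edges:
        if i == j:
            continue
        outgoing[i].add(j)
        incoming[j].add(i)
    return incoming, outgoing


def meets_degree_bound(incoming, f):
    """Check the in-degree bound |N_i^-| > 2f at every node (relevant for f > 0)."""
    return all(len(nbrs) > 2 * f for nbrs in incoming.values())
```

In a complete graph on four nodes every node has three incoming neighbors, so the bound holds for f = 1 but not for f = 2. The bound is only one consequence of the necessary condition, so passing this check does not by itself establish that a graph is suitable.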


Failure Model: We consider the Byzantine failure model, with up to $f$ nodes becoming
faulty.  A  faulty  node  may  misbehave  arbitrarily.  Possible  misbehavior  includes  sending 
incorrect  and  mismatching  (or  inconsistent)  messages  to  different  neighbors.  The  faulty 
nodes  may  potentially  collaborate  with  each  other.  Moreover,  the  faulty  nodes  are  assumed 
to  have  a  complete  knowledge  of  the  execution  of  the  algorithm,  including  the  states  of  all  the 
nodes,  contents  of  messages  the  other  nodes  send  to  each  other,  the  algorithm  specification, 
and  the  network  topology. 


3  Iterative  Approximate  Byzantine  Consensus  (IABC) 


Each node $i$ maintains state $v_i$, with $v_i[t]$ denoting the state of node $i$ at the end of the $t$-th
iteration of the algorithm. The initial state of node $i$, $v_i[0]$, is equal to the initial input provided
to node $i$. At the start of the $t$-th iteration ($t > 0$), the state of node $i$ is $v_i[t-1]$.

Let $\mathcal{F}$ denote the set of faulty nodes. Thus, the nodes in $\mathcal{V} - \mathcal{F}$ are non-faulty.¹

•  $U[t] = \max_{i \in \mathcal{V} - \mathcal{F}} v_i[t]$. $U[t]$ is the largest state among the fault-free nodes at the end
of the $t$-th iteration. Since the initial state of each node is equal to its input, $U[0]$ is
equal to the maximum value of the initial input at the fault-free nodes.

•  $\mu[t] = \min_{i \in \mathcal{V} - \mathcal{F}} v_i[t]$. $\mu[t]$ is the smallest state among the fault-free nodes at the end
of the $t$-th iteration. $\mu[0]$ is equal to the minimum value of the initial input at the
fault-free nodes.

¹ For sets $X$ and $Y$, $X - Y$ contains elements that are in $X$ but not in $Y$. That is,
$X - Y = \{ i \mid i \in X,\ i \notin Y \}$.


The following conditions must be satisfied by an IABC algorithm in the presence of up to $f$
Byzantine faulty nodes:

•  Validity: $\forall t > 0$, $\mu[t] \ge \mu[t-1]$ and $U[t] \le U[t-1]$.

•  Convergence: $\lim_{t \to \infty} (U[t] - \mu[t]) = 0$. Equivalently, $\lim_{t \to \infty} (v_i[t] - v_j[t]) = 0$ for
$i, j \in \mathcal{V} - \mathcal{F}$.

An  iterative  algorithm  is  said  to  be  correct  if  it  satisfies  the  validity  and  convergence 
conditions.  We  will  prove  the  correctness  of  Algorithm  1  below  in  all  graphs  that  satisfy  the 
necessary  condition  in  Theorem  2  of  [16].  The  algorithm  should  be  performed  by  each  node 
$i$ in the $t$-th iteration, $t \ge 1$. The faulty nodes may deviate from the algorithm specification.
If  a  fault-free  node  does  not  receive  an  expected  message  from  an  incoming  neighbor  (in  the 
Receive  step  below),  then  that  message  is  assumed  to  have  some  default  value. 


Algorithm  1 


Steps  to  be  performed  by  node  i  in  the  t-th  iteration: 

1.  Transmit step: Transmit current state $v_i[t-1]$ on all outgoing edges.

2.  Receive step: Receive values on all incoming edges. These values form vector $r_i[t]$ of
size $|N_i^-|$.

3.  Update step: Sort the values in $r_i[t]$ in increasing order, and eliminate the smallest
$f$ values and the largest $f$ values (breaking ties arbitrarily). Let $N_i^*[t]$ denote the
identifiers of nodes from whom the remaining $|N_i^-| - 2f$ values were received, and let
$w_j$ denote the value received from node $j \in N_i^*[t]$.

For convenience, define $w_i = v_i[t-1]$.

Observe that if $j \in \{i\} \cup N_i^*[t]$ is fault-free, then $w_j = v_j[t-1]$.

Define

$$ v_i[t] = \sum_{j \in \{i\} \cup N_i^*[t]} a_i \, w_j \tag{1} $$

where

$$ a_i = \frac{1}{|N_i^-| - 2f + 1} = \frac{1}{|N_i^*[t]| + 1} $$

Recall that $i \notin N_i^*[t]$ because $(i, i) \notin \mathcal{E}$. The “weight” of each term on the right-hand
side of (1) is $a_i$, and these weights add to 1.

Observe that $0 < a_i \le 1$.


For future reference, let us define $\alpha$ as:

$$ \alpha = \min_{i \in \mathcal{V}} a_i \tag{2} $$

Note that $0 < \alpha \le 1$. Specifically, $\alpha$ is a positive constant that depends only on
$f$ and the graph $G(\mathcal{V}, \mathcal{E})$.
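The Update step of Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `iabc_update`, the dictionary representation of received values, and breaking ties by node identifier are choices made here (the algorithm allows any tie-breaking rule).

```python
def iabc_update(own_state, received, f):
    """One Update step of the trimmed-mean rule for a single node.

    own_state: v_i[t-1]
    received:  dict mapping incoming-neighbor id -> value received this round
    f:         maximum number of Byzantine nodes tolerated
    Returns (new_state, survivors), where survivors plays the role of N_i^*[t].
    """
    if len(received) < 2 * f:
        raise ValueError("need at least 2f incoming values to trim")
    # Sort by value; ties are broken by node id (any rule is allowed).
    ranked = sorted(received.items(), key=lambda kv: (kv[1], kv[0]))
    kept = ranked[f:len(ranked) - f]      # drop the f smallest and f largest
    a_i = 1.0 / (len(kept) + 1)           # weight a_i = 1 / (|N_i^*[t]| + 1)
    new_state = a_i * (own_state + sum(value for _, value in kept))
    survivors = [node for node, _ in kept]
    return new_state, survivors
```

For example, with own state 1.0, received values {2: 0.0, 3: 2.0, 4: 100.0} and f = 1, the values from nodes 2 and 4 are trimmed, node 3 survives, and the new state is (1.0 + 2.0) / 2 = 1.5.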


Similar  algorithms  have  been  proven  to  work  correctly  in  fully  connected  graphs  [5,  15] 
and  arbitrary  directed  graphs  satisfying  the  necessary  condition  stated  in  [15].  In  this  paper, 
we  provide  an  alternate  proof  of  correctness  in  such  arbitrary  graphs,  using  an  alternate  form 
of  the  necessary  condition  [16]. 


4  Matrix  Preliminaries 


We use boldface upper case letters to denote matrices, rows of matrices, and their elements.
For instance, $\mathbf{H}$ denotes a matrix, $\mathbf{H}_i$ denotes the $i$-th row of matrix $\mathbf{H}$, and $\mathbf{H}_{ij}$ denotes the
element at the intersection of the $i$-th row and the $j$-th column of matrix $\mathbf{H}$.

Definition 1 A vector is said to be stochastic if all the elements of the vector are
non-negative, and the elements add up to 1. A matrix is said to be row stochastic if each row of
the matrix is a stochastic vector.

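For a row stochastic matrix $\mathbf{A}$, coefficients of ergodicity $\delta(\mathbf{A})$ and $\lambda(\mathbf{A})$ are defined as
[18]:

$$ \delta(\mathbf{A}) := \max_{j} \max_{i_1, i_2} |\mathbf{A}_{i_1 j} - \mathbf{A}_{i_2 j}| \tag{3} $$

$$ \lambda(\mathbf{A}) := 1 - \min_{i_1, i_2} \sum_{j} \min(\mathbf{A}_{i_1 j}, \mathbf{A}_{i_2 j}) \tag{4} $$

It is easy to see that $0 \le \delta(\mathbf{A}) \le 1$ and $0 \le \lambda(\mathbf{A}) \le 1$, and that the rows are all identical if
and only if $\delta(\mathbf{A}) = 0$. Additionally, $\lambda(\mathbf{A}) = 0$ if and only if $\delta(\mathbf{A}) = 0$.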
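The two coefficients in (3) and (4) are straightforward to compute for a small matrix. The sketch below is an illustrative plain-Python version; the names `delta_coeff` and `lambda_coeff` and the list-of-lists matrix representation are assumptions made here.

```python
from itertools import combinations


def delta_coeff(A):
    """delta(A): largest spread between two entries of the same column."""
    cols = range(len(A[0]))
    return max(max(row[j] for row in A) - min(row[j] for row in A) for j in cols)


def lambda_coeff(A):
    """lambda(A): 1 minus the smallest row-pair overlap sum_j min(A[i1][j], A[i2][j])."""
    if len(A) < 2:
        return 0.0  # a single row overlaps fully with itself
    overlap = min(
        sum(min(x, y) for x, y in zip(r1, r2))
        for r1, r2 in combinations(A, 2)
    )
    return 1.0 - overlap
```

Pairing a row with itself gives an overlap of 1, which can never be the minimum when the matrix has two or more rows, so restricting to distinct pairs gives the same value as (4). A matrix with identical rows yields 0 for both coefficients, and the identity matrix yields 1 for both.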

The next result from [8] establishes a relation between the coefficient of ergodicity $\delta(\cdot)$ of
a product of row stochastic matrices, and the coefficients of ergodicity $\lambda(\cdot)$ of the individual
matrices defining the product.

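Claim 1 For any $p$ square row stochastic matrices $\mathbf{Q}(1), \mathbf{Q}(2), \dots, \mathbf{Q}(p)$,

$$ \delta(\mathbf{Q}(1)\mathbf{Q}(2) \cdots \mathbf{Q}(p)) \le \prod_{i=1}^{p} \lambda(\mathbf{Q}(i)). \tag{5} $$

Claim 1 is proved in [8]. It implies that if, for all $i$, $\lambda(\mathbf{Q}(i)) \le 1 - \gamma$ for some $\gamma > 0$, then
$\delta(\mathbf{Q}(1)\mathbf{Q}(2) \cdots \mathbf{Q}(p))$ will approach zero as $p$ approaches $\infty$.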

Definition 2 A row stochastic matrix $\mathbf{H}$ is said to be a scrambling matrix if $\lambda(\mathbf{H}) < 1$
[8, 18].

In a scrambling matrix $\mathbf{H}$, since $\lambda(\mathbf{H}) < 1$, for each pair of rows $i_1$ and $i_2$ there exists a
column $j$ (which may depend on $i_1$ and $i_2$) such that $\mathbf{H}_{i_1 j} > 0$ and $\mathbf{H}_{i_2 j} > 0$, and vice-versa
[8, 18]. As a special case, if any one column of a row stochastic matrix $\mathbf{H}$ contains only
non-zero elements that are lower bounded by some constant $\gamma > 0$, then $\mathbf{H}$ must be scrambling,
and $\lambda(\mathbf{H}) \le 1 - \gamma$.
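The scrambling property and the column-based special case can be checked directly. The sketch below is illustrative only; `is_scrambling` and `column_floor` are hypothetical helper names, and matrices are plain lists of rows.

```python
def is_scrambling(H):
    """True if every pair of distinct rows shares a column where both entries are positive."""
    rows, cols = len(H), len(H[0])
    return all(
        any(H[a][j] > 0 and H[b][j] > 0 for j in range(cols))
        for a in range(rows)
        for b in range(a + 1, rows)
    )


def column_floor(H):
    """Largest gamma such that some column of H has every entry >= gamma."""
    return max(min(row[j] for row in H) for j in range(len(H[0])))
```

A positive `column_floor` is sufficient for the scrambling property but not necessary: a matrix whose rows overlap pairwise in different columns is scrambling even though no single column is entirely positive.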


5  Matrix  Representation  of  Algorithm  1 

Recall that $\mathcal{F}$ is the set of faulty nodes. Let $|\mathcal{F}| = \phi$. Without loss of generality, suppose
that nodes 1 through $(n - \phi)$ are fault-free, and if $\phi > 0$, nodes $(n - \phi + 1)$ through $n$ are
faulty.

Denote by $\mathbf{v}[0]$ the column vector consisting of the initial states of all the fault-free nodes.
Denote by $\mathbf{v}[t]$, where $t \ge 1$, the column vector consisting of the states of all the fault-free
nodes at the end of the $t$-th iteration. The $i$-th element of vector $\mathbf{v}[t]$ is state $v_i[t]$.
The size of the column vector $\mathbf{v}[t]$ is $(n - \phi)$.


Claim 2 We can express the iterative update of the state of a fault-free node $i$ ($1 \le i \le n - \phi$)
performed in (1) using the matrix form in (6) below, where $\mathbf{M}_i[t]$ satisfies the following four
conditions.

$$ v_i[t] = \mathbf{M}_i[t] \, \mathbf{v}[t-1] \tag{6} $$

In addition to $t$, the row vector $\mathbf{M}_i[t]$ may depend on the state vector $\mathbf{v}[t-1]$ as well as
the behavior of the faulty nodes in $\mathcal{F}$. For simplicity, the notation does not explicitly
represent this dependence.

1.  $\mathbf{M}_i[t]$ is a stochastic row vector of size $(n - \phi)$. Thus, $\mathbf{M}_{ij}[t] \ge 0$ for $1 \le j \le n - \phi$,
and

$$ \sum_{1 \le j \le n - \phi} \mathbf{M}_{ij}[t] = 1 $$

2.  $\mathbf{M}_{ii}[t]$ equals $a_i$ defined in Algorithm 1. Recall that $a_i \ge \alpha$.

3.  $\mathbf{M}_{ij}[t]$ is non-zero only if $(j, i) \in \mathcal{E}$ or $j = i$.

4.  At least $|N_i^- \cap (\mathcal{V} - \mathcal{F})| - f + 1$ elements in $\mathbf{M}_i[t]$ are lower bounded by some constant
$\beta > 0$, to be defined later ($\beta$ is independent of $i$). Note that $N_i^- \cap (\mathcal{V} - \mathcal{F})$ is the set
of fault-free incoming neighbors of node $i$.


Proof: The proof of this claim is presented in Section 5.1 below. The last condition above
plays an important role in the proof, and the main contribution of this paper is to “design”
$\mathbf{M}_i[t]$ to make this condition true. □


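By “stacking” (6) for different $i$, $1 \le i \le n - \phi$, we can represent the state update for all
the fault-free nodes together using (7) below, where $\mathbf{M}[t]$ is a $(n - \phi) \times (n - \phi)$ matrix, with
its $i$-th row being equal to $\mathbf{M}_i[t]$ in (6).

$$ \mathbf{v}[t] = \mathbf{M}[t] \, \mathbf{v}[t-1] \tag{7} $$

The four properties of $\mathbf{M}_i[t]$ imply that $\mathbf{M}[t]$ is a row stochastic matrix with a non-zero
diagonal. Also, the $i$-th row of $\mathbf{M}[t]$ contains at least $|N_i^- \cap (\mathcal{V} - \mathcal{F})| - f + 1$ elements lower bounded
by $\beta$ ($\beta$ will be defined later). This property of $\mathbf{M}[t]$ turns out to be important in proving
convergence of Algorithm 1.

$\mathbf{M}[t]$ is said to be a transition matrix.

By repeated application of (7), we obtain:

$$ \mathbf{v}[t] = \left( \prod_{i=1}^{t} \mathbf{M}[i] \right) \mathbf{v}[0] $$

Here, as follows from (7), the product is taken in the order $\mathbf{M}[t] \, \mathbf{M}[t-1] \cdots \mathbf{M}[1]$.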

5.1  Correctness  of  Claim  2 

Figure  1  illustrates  the  various  sets  used  here.  Some  of  the  sets  in  this  figure  are  not  yet 
defined,  and  will  be  defined  later  in  the  paper. 

We prove the correctness of Claim 2 by constructing $\mathbf{M}_i[t]$ for $1 \le i \le n - \phi$ that satisfies
the conditions in Claim 2. Recall that nodes 1 through $n - \phi$ are fault-free, and the remaining
$\phi$ nodes ($\phi \le f$) are faulty.

Consider a fault-free node $i$ performing the Update step in Algorithm 1. Recall that the
largest $f$ and the smallest $f$ values are eliminated from $r_i[t]$. Let us denote by $L$ and $S$,
respectively, the set of nodes² from whom the largest $f$ values and the smallest $f$ values
were received by node $i$ in iteration $t$. Thus, $|L| = |S| = f$, $N_i^*[t] = N_i^- - (L \cup S)$, and

$$ |N_i^*[t]| = |N_i^- - (L \cup S)| = |N_i^-| - 2f. $$

For any set of nodes $X$ here, let $\delta_X$ and $g_X$ respectively denote the number of faulty nodes,
and the number of fault-free nodes, in set $X$. For instance, $\delta_L$ and $g_L$ denote, respectively,
the number of faulty and fault-free nodes in set $L$. Thus,

$$ \delta_L + g_L = \delta_S + g_S = f $$


Let

$$ \delta = |N_i^- \cap \mathcal{F}| $$

That is, the number of faulty incoming neighbors of node $i$ is denoted as $\delta$. Therefore,
$\delta \le \phi \le f$, and

$$ \delta = \delta_L + \delta_S + \delta_{N_i^*[t]} $$

² Although $L$ and $S$ may be different for each $t$, for simplicity, we do not explicitly represent this
dependence on $t$ in the notations $L$ and $S$.


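Then, it follows that

$$ g_L = f - \delta_L = \delta_S + \delta_{N_i^*[t]} + (f - \delta), \text{ and} \tag{8} $$

$$ g_S = f - \delta_S = \delta_L + \delta_{N_i^*[t]} + (f - \delta) \tag{9} $$

For fault-free node $i$, we now define the elements of row $\mathbf{M}_i[t]$. We consider two cases
separately: (i) $f - \delta + \delta_{N_i^*[t]} = 0$, and (ii) $f - \delta + \delta_{N_i^*[t]} > 0$.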


5.1.1  $f - \delta + \delta_{N_i^*[t]} = 0$

We know that $(f - \delta) \ge 0$ and $\delta_{N_i^*[t]} \ge 0$. Therefore, $f - \delta + \delta_{N_i^*[t]} = 0$ implies that $f = \delta$
and $\delta_{N_i^*[t]} = 0$. Thus, in this case, all the nodes in $N_i^*[t]$ are fault-free.

•  For each $j \in \{i\} \cup N_i^*[t]$, define $\mathbf{M}_{ij}[t] = a_i$. Element $\mathbf{M}_{ij}[t]$ corresponds to the term
$a_i w_j$ in (1).

Recall that $a_i \ge \alpha$, and that each node in $\{i\} \cup N_i^*[t]$ in this case is fault-free.

•  For each $j$ such that $j \in \mathcal{V} - \mathcal{F}$ and $j \notin \{i\} \cup N_i^*[t]$, define $\mathbf{M}_{ij}[t] = 0$.

Observe that with the above definition of elements of $\mathbf{M}_i[t]$,

$$ \mathbf{M}_i[t] \, \mathbf{v}[t-1] = \sum_{k \in \{i\} \cup N_i^*[t]} a_i \, w_k $$

In the above procedure, we have set $|N_i^*[t]| + 1$ elements of $\mathbf{M}_i[t]$ equal to $a_i$ (recall that
$a_i \ge \alpha$).

Now, because $\delta = f$ and $|N_i^*[t]| = |N_i^-| - 2f$, we have $|N_i^- \cap (\mathcal{V} - \mathcal{F})| - f + 1 =
|N_i^-| - \delta - f + 1 = |N_i^-| - 2f + 1 = |N_i^*[t]| + 1$. Also, in this case $a_i = 1/(|N_i^*[t]| + 1)$. Thus,
it should be easy to see that the conditions in Claim 2 are satisfied by defining $\beta = \alpha$.


5.1.2  $f - \delta + \delta_{N_i^*[t]} > 0$

Since $0 \le \delta_{N_i^*[t]} \le \delta \le f$, $f - \delta + \delta_{N_i^*[t]} > 0$ implies that $f > 0$. When $f > 0$, the necessary
condition in [15] implies that $|N_i^-| \ge 2f + 1$. Therefore, the set $N_i^*[t]$ is non-empty. As per
(1), each node $k \in N_i^*[t]$ contributes $a_i w_k$ to the new state $v_i[t]$ of node $i$. We will define
elements of $\mathbf{M}_i[t]$ to account for the contribution of each node $k \in N_i^*[t]$.

Define subsets $L^*$ and $S^*$ such that $L^* \subseteq L$, $S^* \subseteq S$, $L^* \cap \mathcal{F} = S^* \cap \mathcal{F} = \emptyset$, and
$|L^*| = |S^*| = f - \delta + \delta_{N_i^*[t]}$. That is, sets $L^*$ and $S^*$ are subsets of $L$ and $S$, respectively,
each of size $f - \delta + \delta_{N_i^*[t]}$, and containing only fault-free nodes. Expressions (8) and (9) for
$g_L$ and $g_S$ imply that such subsets exist.

Let

$$ L^* = \{ l_j \mid 1 \le j \le f - \delta + \delta_{N_i^*[t]} \} $$

and

$$ S^* = \{ s_j \mid 1 \le j \le f - \delta + \delta_{N_i^*[t]} \}. $$
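Consider any node $k \in N_i^*[t]$. For each $j$, $1 \le j \le f - \delta + \delta_{N_i^*[t]}$,

$$ v_{s_j}[t-1] \le w_k \le v_{l_j}[t-1] $$

Therefore, we can find weights $\lambda_{k,j} \ge 0$ and $\psi_{k,j} \ge 0$ such that

$$ \lambda_{k,j} + \psi_{k,j} = 1 $$

and

$$ w_k = \lambda_{k,j} \, v_{l_j}[t-1] + \psi_{k,j} \, v_{s_j}[t-1] $$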
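The weights in the convex combination above are not unique only when the two endpoint states coincide. A minimal Python sketch of one way to compute them follows; the function name `convex_weights` and the even split chosen for the degenerate case are assumptions made here for illustration.

```python
def convex_weights(w, v_small, v_large):
    """Return (lam, psi) with lam, psi >= 0, lam + psi = 1 and
    w == lam * v_large + psi * v_small, assuming v_small <= w <= v_large."""
    if not (v_small <= w <= v_large):
        raise ValueError("w must lie between v_small and v_large")
    if v_large == v_small:
        return 0.5, 0.5  # any split works when the endpoints coincide
    lam = (w - v_small) / (v_large - v_small)
    return lam, 1.0 - lam
```

Here `v_small` stands for the state of a node in S*, `v_large` for the state of a node in L*, and `w` for the value received from a surviving node k.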

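Clearly, at least one of the weights $\lambda_{k,j}$ and $\psi_{k,j}$ must be $\ge 1/2$. Now, observe that

$$ a_i w_k = \frac{a_i}{f - \delta + \delta_{N_i^*[t]}} \sum_{j=1}^{f - \delta + \delta_{N_i^*[t]}} \left( \lambda_{k,j} \, v_{l_j}[t-1] + \psi_{k,j} \, v_{s_j}[t-1] \right) \tag{10} $$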

The above equality is true independent of whether $k$ is fault-free or faulty. We will later use
the above equality for the case when $k$ is a faulty node. When $k$ is fault-free,

$$ w_k = v_k[t-1], $$

and we can similarly obtain the equality below.

$$ a_i w_k = \frac{a_i}{2} \, v_k[t-1] + \frac{a_i}{2(f - \delta + \delta_{N_i^*[t]})} \sum_{j=1}^{f - \delta + \delta_{N_i^*[t]}} \left( \lambda_{k,j} \, v_{l_j}[t-1] + \psi_{k,j} \, v_{s_j}[t-1] \right) \tag{11} $$


We now use (1), (10) and (11) to define elements of $\mathbf{M}_i[t]$ in the following four cases:

•  Case 1: Node $i$

Define $\mathbf{M}_{ii}[t] = a_i$. This is obtained by observing in (1) that the contribution of node
$i$ to the new state $v_i[t]$ is $a_i w_i = a_i v_i[t-1]$.

•  Case 2: Fault-free nodes in $N_i^*[t]$

For each $k \in N_i^*[t] \cap (\mathcal{V} - \mathcal{F})$, define $\mathbf{M}_{ik}[t] = \frac{a_i}{2}$. This choice is motivated by (11),
wherein the contribution of node $k$ to $a_i w_k$ is $\frac{a_i}{2} v_k[t-1]$. In Case 2, $|N_i^*[t] \cap (\mathcal{V} - \mathcal{F})| =
|N_i^*[t]| - \delta_{N_i^*[t]}$ elements of $\mathbf{M}_i[t]$ are defined.

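•  Case 3: Nodes in $L^*$ and $S^*$

For $1 \le j \le f - \delta + \delta_{N_i^*[t]}$, consider $l_j \in L^*$. In this case,

$$ \mathbf{M}_{i l_j}[t] = \sum_{k \in N_i^*[t] \cap \mathcal{F}} \frac{a_i}{f - \delta + \delta_{N_i^*[t]}} \, \lambda_{k,j} + \sum_{k \in N_i^*[t] \cap (\mathcal{V} - \mathcal{F})} \frac{a_i}{2(f - \delta + \delta_{N_i^*[t]})} \, \lambda_{k,j} $$

Similarly, for $1 \le j \le f - \delta + \delta_{N_i^*[t]}$, consider $s_j \in S^*$. In this case,

$$ \mathbf{M}_{i s_j}[t] = \sum_{k \in N_i^*[t] \cap \mathcal{F}} \frac{a_i}{f - \delta + \delta_{N_i^*[t]}} \, \psi_{k,j} + \sum_{k \in N_i^*[t] \cap (\mathcal{V} - \mathcal{F})} \frac{a_i}{2(f - \delta + \delta_{N_i^*[t]})} \, \psi_{k,j} $$

These expressions are obtained by summing (10) and (11), respectively, over the faulty
and fault-free nodes in $N_i^*[t]$, and then identifying the contribution of each node in
$L^*$ and $S^*$ to this sum. Recall the earlier observation that at least one of $\lambda_{k,j}$ and
$\psi_{k,j}$ must be $\ge 1/2$ for each pair $k, j$ where $k \in N_i^*[t]$ and $1 \le j \le f - \delta + \delta_{N_i^*[t]}$.
Therefore, it follows that at least $f - \delta + \delta_{N_i^*[t]}$ elements of $\mathbf{M}_i[t]$ defined in Case 3
must be $\ge \frac{a_i}{4(f - \delta + \delta_{N_i^*[t]})}$.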


•  Case 4: Nodes in $(\mathcal{V} - \mathcal{F}) - (\{i\} \cup N_i^*[t] \cup L^* \cup S^*)$

These fault-free nodes have not yet been considered in Cases 1, 2 and 3. For each node
$k \in (\mathcal{V} - \mathcal{F}) - (\{i\} \cup N_i^*[t] \cup L^* \cup S^*)$, we assign $\mathbf{M}_{ik}[t] = 0$.

Observe that the above definition of the elements of $\mathbf{M}_i[t]$ ensures that

$$ \sum_{j \in \{i\} \cup N_i^*[t]} a_i \, w_j = \mathbf{M}_i[t] \, \mathbf{v}[t-1] $$

However, the contribution by the faulty nodes in $N_i^*[t]$ in (1) is now replaced by an equivalent
contribution by the nodes in $L^*$ and $S^*$.

Now let us verify that the four conditions in Claim 2 hold for the above assignments to
the elements of $\mathbf{M}_i[t]$.

1.  Observe that all the elements of $\mathbf{M}_i[t]$ are non-negative. Case 1 specifies just $\mathbf{M}_{ii}[t] =
a_i$. The elements of $\mathbf{M}_i[t]$ specified in Case 2 add up to

$$ \frac{a_i}{2} \, |N_i^*[t] \cap (\mathcal{V} - \mathcal{F})| $$

Recall that for each $j$, $1 \le j \le (f - \delta + \delta_{N_i^*[t]})$, $\lambda_{k,j} + \psi_{k,j} = 1$ for $k \in N_i^*[t]$. Therefore,
when added over all $k \in N_i^*[t]$ and $1 \le j \le (f - \delta + \delta_{N_i^*[t]})$, the elements of $\mathbf{M}_i[t]$
specified in Case 3 add up to

$$ a_i \, |N_i^*[t] \cap \mathcal{F}| + \frac{a_i}{2} \, |N_i^*[t] \cap (\mathcal{V} - \mathcal{F})| $$

Therefore, when all the elements of $\mathbf{M}_i[t]$ defined in Cases 1, 2 and 3 are added together,
we get

$$ a_i + a_i \, |N_i^*[t] \cap \mathcal{F}| + a_i \, |N_i^*[t] \cap (\mathcal{V} - \mathcal{F})| = a_i (|N_i^*[t]| + 1) = 1 $$

because $a_i = 1/(|N_i^*[t]| + 1)$. Now observe that the elements specified in Cases 1,
2 and 3 are clearly $\le 1$. In the expression for $\mathbf{M}_{i l_j}[t]$ in Case 3, observe that the
two summations on the right side together contain $|N_i^*[t]|$ terms, and in these terms,
observe that $\lambda_{k,j} \le 1$, $f - \delta + \delta_{N_i^*[t]} \ge 1$ and $a_i = \frac{1}{|N_i^*[t]| + 1}$. Therefore, $\mathbf{M}_{i l_j}[t] \le 1$.
Similarly, we can show that $\mathbf{M}_{i s_j}[t] \le 1$ as well.

Thus, we have shown that $\mathbf{M}_i[t]$ is a stochastic vector.

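2.  $\mathbf{M}_{ii}[t] = a_i$ as specified in Case 1.

3.  Since $\mathbf{M}_{ij}[t]$ is defined to be non-zero only in Cases 1, 2 and 3, which consider the
nodes only in $\{i\} \cup N_i^-$, it follows that $\mathbf{M}_{ij}[t]$ is non-zero only if $(j, i) \in \mathcal{E}$ or $j = i$.

4.  Cases 1 and 2 together set $1 + |N_i^*[t] \cap (\mathcal{V} - \mathcal{F})| = 1 + |N_i^*[t]| - \delta_{N_i^*[t]}$ elements of $\mathbf{M}_i[t]$ to
be $\ge a_i/2$. We observed earlier that Case 3 results in at least $f - \delta + \delta_{N_i^*[t]}$ elements of
$\mathbf{M}_i[t]$ being $\ge \frac{a_i}{4(f - \delta + \delta_{N_i^*[t]})}$. Also, observe that the elements of $\mathbf{M}_i[t]$ specified in Cases
1 and 2 are distinct from those specified in Case 3, and that $\frac{a_i}{2} \ge \frac{a_i}{4(f - \delta + \delta_{N_i^*[t]})}$. Thus,
overall, at least

$$ (1 + |N_i^*[t]| - \delta_{N_i^*[t]}) + f - \delta + \delta_{N_i^*[t]} = |N_i^*[t]| + f - \delta + 1 = |N_i^-| - f - \delta + 1 = |N_i^- \cap (\mathcal{V} - \mathcal{F})| - f + 1 $$

elements of $\mathbf{M}_i[t]$ are set $\ge \frac{a_i}{4(f - \delta + \delta_{N_i^*[t]})}$. Derivation of the above equation uses the
facts that $|N_i^*[t]| = |N_i^-| - 2f$ and $|N_i^- \cap (\mathcal{V} - \mathcal{F})| = |N_i^-| - \delta$. Then by defining $\beta$ as
below, condition 4 in Claim 2 holds true.

$$ \beta = \frac{\alpha}{4(f - \delta + \delta_{N_i^*[t]})} $$

Since $\delta_{N_i^*[t]} \le \delta$, the denominator is at most $4f$, so this $\beta$ is never smaller than $\frac{\alpha}{4f}$, a bound
that does not depend on $i$ or $t$.

Therefore, Claim 2 is proved correct.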
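The construction in Sections 5.1.1 and 5.1.2 can be traced on a small example. The following Python sketch builds one row of the transition matrix from Cases 1 to 4; it is a minimal illustration, not a reference implementation. The names `build_row` and `convex_weights`, the dictionary-based row, tie-breaking by node identifier, and the choice of the first eligible fault-free nodes for L* and S* are all assumptions made here. The set of faulty nodes is an input because the construction is an analysis device; the nodes running the algorithm never know it.

```python
def convex_weights(w, v_small, v_large):
    """Split w as lam * v_large + psi * v_small with lam + psi = 1."""
    if v_large == v_small:
        return 0.5, 0.5
    lam = (w - v_small) / (v_large - v_small)
    return lam, 1.0 - lam


def build_row(i, states, received, faulty, f):
    """Return row M_i[t] as {fault-free node: weight}.

    states:   fault-free node id -> v[t-1] (must include node i)
    received: incoming-neighbor id -> value delivered to node i this round
    faulty:   ids of faulty nodes (known to the analysis, not to node i)
    """
    ranked = sorted(received, key=lambda k: (received[k], k))
    S, L = ranked[:f], ranked[len(ranked) - f:]
    n_star = ranked[f:len(ranked) - f]          # survivors, N_i^*[t]
    a_i = 1.0 / (len(n_star) + 1)
    delta = sum(1 for k in received if k in faulty)
    delta_star = sum(1 for k in n_star if k in faulty)
    m = f - delta + delta_star                  # common size of L* and S*
    l_star = [k for k in L if k not in faulty][:m]
    s_star = [k for k in S if k not in faulty][:m]

    row = {k: 0.0 for k in states}              # Case 4: everything else stays 0
    row[i] = a_i                                # Case 1: the node itself
    for k in n_star:
        if m == 0:
            row[k] = a_i                        # Section 5.1.1: all survivors fault-free
            continue
        if k in faulty:
            share = a_i / m                     # faulty survivor, equation (10)
        else:
            row[k] += a_i / 2                   # Case 2, from equation (11)
            share = a_i / (2 * m)
        for l, s in zip(l_star, s_star):        # Case 3: spread over L* and S*
            lam, psi = convex_weights(received[k], states[s], states[l])
            row[l] += share * lam
            row[s] += share * psi
    return row
```

As a usage example, take f = 1, fault-free states {1: 1.0, 2: 0.0, 3: 2.0, 4: 4.0}, and a faulty node 5 that sends 3.0 to node 1. Node 1 trims the values from nodes 2 and 4 and keeps those from nodes 3 and 5, so Algorithm 1 gives (1.0 + 2.0 + 3.0) / 3 = 2.0. The constructed row assigns weights 1/3, 1/6, 1/6 and 1/3 to nodes 1, 2, 3 and 4, which sum to 1 and reproduce the same value 2.0 using fault-free states only.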


5.2  Correspondence  Between  Sufficiency  Condition  and  M[t] 

Let us define set $R_{\mathcal{F}}$ of subgraphs of $G(\mathcal{V}, \mathcal{E})$ as follows.

$R_{\mathcal{F}} = \{ H \mid H$ is obtained by removing all the faulty
nodes from $\mathcal{V}$ along with their edges, and then
removing any additional $f$ incoming edges
at each fault-free node $\}$  (12)

Thus, $\mathcal{V} - \mathcal{F}$ is the set of nodes in each graph in $R_{\mathcal{F}}$.

Let $r$ denote $|R_{\mathcal{F}}|$. $r$ depends on $\mathcal{F}$ and the underlying network, and it is finite.

Claim 3 Suppose that graph $G(\mathcal{V}, \mathcal{E})$ satisfies the necessary condition in Theorem 2 in [16].
Then it follows that in each $H \in R_{\mathcal{F}}$, there exists at least one node that has directed paths
to all the nodes in $H$ (consisting of the edges in $H$).


Proof:  The  proof  follows  from  Theorem  2  of  [16].  □ 

In this discussion, let us denote a graph by an italic upper case letter, and the corresponding
connectivity matrix using the same letter in boldface upper case. Thus, $\mathbf{H}$ will denote
the connectivity matrix for graph $H \in R_{\mathcal{F}}$; $\mathbf{H}$ is defined as follows: (i) for $1 \le i, j \le n - \phi$,
if there is a directed link from node $j$ to node $i$ in graph $H$ then $\mathbf{H}_{ij} = 1$, and (ii) $\mathbf{H}_{ii} = 1$
for $1 \le i \le n - \phi$. Note that in our notation, the $i$-th row of $\mathbf{H}$ (that is, $\mathbf{H}_i$) corresponds
to the incoming links at node $i$, and the self-loop at node $i$. The connectivity matrix $\mathbf{H}$ for
any $H \in R_{\mathcal{F}}$ has a non-zero diagonal.

Lemma 1 For any $H \in R_{\mathcal{F}}$, $\mathbf{H}^{n-\phi}$ has at least one non-zero column.


Proof: By Claim 3, in graph $H$ there exists at least one node, say node $k$, that has a
directed path in $H$ to all the remaining nodes in $H$. Since the path from $k$ to
any other node in $H$ can contain at most $n - \phi - 1$ directed edges, the $k$-th column of matrix
$\mathbf{H}^{n-\phi}$ will be non-zero.³ □
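Lemma 1 can be checked numerically on a small connectivity matrix. The sketch below uses boolean matrix products, which track only whether an entry is zero or non-zero; the helper names `bool_matmul` and `nonzero_columns_of_power` and the 0-based indexing are assumptions made here for illustration.

```python
def bool_matmul(A, B):
    """Product of two square 0/1 matrices, keeping only zero / non-zero information."""
    n = len(A)
    return [
        [int(any(A[i][k] and B[k][j] for k in range(n))) for j in range(n)]
        for i in range(n)
    ]


def nonzero_columns_of_power(H, p):
    """Return the (0-based) indices of columns of H**p whose entries are all non-zero.

    H follows the convention above: H[i][j] = 1 if node j has a link to node i,
    and every diagonal entry is 1.
    """
    P = H
    for _ in range(p - 1):
        P = bool_matmul(P, H)
    n = len(H)
    return [j for j in range(n) if all(P[i][j] for i in range(n))]
```

For a three-node directed path in which the first node reaches the second and the second reaches the third, the connectivity matrix is [[1, 0, 0], [1, 1, 0], [0, 1, 1]]. No column of the matrix itself is entirely non-zero, but the first column of its third power is, because the first node reaches every node within two hops.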


Definition 3 We will say that an element of a matrix is “non-trivial” if it is lower bounded
by $\beta$.


Definition 4 For matrices $\mathbf{A}$ and $\mathbf{B}$ of identical size, and a scalar $\gamma$, $\mathbf{A} \le \gamma \mathbf{B}$ provided
that $\mathbf{A}_{ij} \le \gamma \mathbf{B}_{ij}$ for all $i, j$.


Lemma 2 For any $t \ge 1$, there exists a graph $H[t] \in R_{\mathcal{F}}$ such that $\beta \mathbf{H}[t] \le \mathbf{M}[t]$.


Proof: Observe that the $i$-th row of the transition matrix $\mathbf{M}[t]$ corresponds to the state
update performed at fault-free node $i$. Recall from Claim 2 that $\mathbf{M}_{ij}[t]$ is non-zero only
if link $(j, i) \in \mathcal{E}$ or $j = i$. Also, by Claim 2, $\mathbf{M}_i[t]$ (i.e., the $i$-th row of $\mathbf{M}[t]$) contains at least
$|N_i^- \cap (\mathcal{V} - \mathcal{F})| - f + 1$ non-trivial elements corresponding to fault-free incoming neighbors
of node $i$ and itself (i.e., the diagonal element).

Now observe that, for any subgraph $H \in R_{\mathcal{F}}$, the $i$-th row of $\mathbf{H}$ contains exactly
$|N_i^- \cap (\mathcal{V} - \mathcal{F})| - f + 1$ non-zero elements, including the diagonal element.

Considering the above two observations, and the definition of set $R_{\mathcal{F}}$, the lemma follows.

□


6  Correctness  of  Algorithm  1 


The  proof  below  uses  techniques  also  applied  in  prior  work  (e.g.,  [9,  3,  17,  19]),  with  some 
similarities  to  the  arguments  used  in  [17,  19]. 


Lemma 3 In the product below of $\mathbf{H}[t]$ matrices for consecutive $r(n - \phi)$ iterations, at least
one column is non-zero.

$$ \prod_{t=z}^{z + r(n-\phi) - 1} \mathbf{H}[t] $$

³ That is, all the elements of the column will be non-zero (more precisely, positive, since the elements of
matrix $\mathbf{H}$ are non-negative). Also, such a non-zero column will exist in $\mathbf{H}^{n-\phi-1}$ too. We use the loose bound
of $n - \phi$ to simplify the presentation.


Proof: Since the above product consists of $r(n - \phi)$ matrices in $R_{\mathcal{F}}$, at least one of the
$r$ distinct connectivity matrices in $R_{\mathcal{F}}$, say matrix $\mathbf{H}_*$, will appear in the above product at
least $n - \phi$ times.

Now observe that: (i) By Lemma 1, $\mathbf{H}_*^{n-\phi}$ contains a non-zero column, say the $k$-th
column is non-zero, and (ii) all the $\mathbf{H}[t]$ matrices in the product contain a non-zero diagonal.
These two observations together imply that the $k$-th column in the above product is non-zero.

□

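Let us now define a sequence of matrices $\mathbf{Q}(i)$ such that each of these matrices is a
product of $r(n - \phi)$ of the $\mathbf{M}[t]$ matrices. Specifically,

$$ \mathbf{Q}(i) = \prod_{t=(i-1) r (n-\phi) + 1}^{i \, r (n-\phi)} \mathbf{M}[t] $$

Observe that

$$ \mathbf{v}[k \, r (n - \phi)] = \left( \prod_{i=1}^{k} \mathbf{Q}(i) \right) \mathbf{v}[0] \tag{13} $$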

Lemma 4 For $i \ge 1$, $\mathbf{Q}(i)$ is a scrambling row stochastic matrix, and $\lambda(\mathbf{Q}(i))$ is bounded
from above by a constant smaller than 1.


Proof: $\mathbf{Q}(i)$ is a product of row stochastic matrices ($\mathbf{M}[t]$); therefore, $\mathbf{Q}(i)$ is row stochastic.


From Lemma 2, for each $t$,

$$ \beta \, \mathbf{H}[t] \le \mathbf{M}[t] $$

Therefore,

$$ \beta^{r(n-\phi)} \prod_{t=(i-1) r (n-\phi) + 1}^{i \, r (n-\phi)} \mathbf{H}[t] \le \mathbf{Q}(i) $$

By using $z = (i - 1) r (n - \phi) + 1$ in Lemma 3, we conclude that the matrix product on the left
side of the above inequality contains a non-zero column. Therefore, $\mathbf{Q}(i)$ contains a non-zero
column as well. Therefore, $\mathbf{Q}(i)$ is a scrambling matrix.


Observe that $r(n - \phi)$ is finite; therefore, $\beta^{r(n-\phi)}$ is non-zero. Since the non-zero terms
in $\mathbf{H}[t]$ matrices are all 1, the non-zero elements in $\prod_{t=(i-1) r (n-\phi) + 1}^{i \, r (n-\phi)} \mathbf{H}[t]$ must each be $\ge 1$.
Therefore, there exists a non-zero column in $\mathbf{Q}(i)$ with all the elements in the column being
$\ge \beta^{r(n-\phi)}$. Therefore, $\lambda(\mathbf{Q}(i)) \le 1 - \beta^{r(n-\phi)}$. □


Theorem  1  Algorithm  1  satisfies  the  validity  and  the  convergence  conditions. 


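Proof: Since $\mathbf{v}[t] = \mathbf{M}[t] \, \mathbf{v}[t-1]$, and $\mathbf{M}[t]$ is a row stochastic matrix, it follows that
Algorithm 1 satisfies the validity condition.


By Claim 1,

$$ \lim_{t \to \infty} \delta\left( \prod_{i=1}^{t} \mathbf{M}[i] \right) \le \lim_{t \to \infty} \prod_{i=1}^{t} \lambda(\mathbf{M}[i]) \tag{14} $$

$$ \le \lim_{t \to \infty} \prod_{i=1}^{\lfloor t / (r(n-\phi)) \rfloor} \lambda(\mathbf{Q}(i)) \tag{15} $$

$$ = 0 \tag{16} $$

The above argument makes use of the facts that $\lambda(\mathbf{M}[t]) \le 1$ and $\lambda(\mathbf{Q}(i)) \le (1 - \beta^{r(n-\phi)}) < 1$.
Thus, the rows of $\prod_{i=1}^{t} \mathbf{M}[i]$ become identical in the limit. This observation, and the fact
that $\mathbf{v}[t] = \left( \prod_{i=1}^{t} \mathbf{M}[i] \right) \mathbf{v}[0]$, together imply that the state of the fault-free nodes satisfies
the convergence condition.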

Now, the validity and convergence conditions together imply that there exists a
scalar $c$ such that

$$ \lim_{t \to \infty} \mathbf{v}[t] = \lim_{t \to \infty} \left( \prod_{i=1}^{t} \mathbf{M}[i] \right) \mathbf{v}[0] = c \, \mathbf{1} $$

where $\mathbf{1}$ denotes a column with all its elements being 1.


□ 
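The behavior stated in Theorem 1 can be observed in a small simulation. The sketch below runs the trimmed-mean update on a complete graph and records the fault-free states after every iteration. It is an illustration under stated assumptions: the function name `simulate`, the `adversary` callback that models a Byzantine sender, and the restriction to a complete graph are choices made here, not part of the paper.

```python
def simulate(initial, faulty, f, rounds, adversary):
    """Run the trimmed-mean iteration on a complete graph.

    initial:   dict node id -> input value
    faulty:    set of Byzantine node ids
    adversary: function (sender, receiver, t) -> value a faulty sender delivers
    Returns a list of dicts holding the fault-free states after each round.
    """
    nodes = sorted(initial)
    good = [i for i in nodes if i not in faulty]
    state = {i: initial[i] for i in good}
    history = [dict(state)]
    for t in range(1, rounds + 1):
        new_state = {}
        for i in good:
            received = [
                adversary(j, i, t) if j in faulty else state[j]
                for j in nodes if j != i
            ]
            received.sort()
            kept = received[f:len(received) - f]   # trim f smallest and f largest
            a_i = 1.0 / (len(kept) + 1)
            new_state[i] = a_i * (state[i] + sum(kept))
        state = new_state
        history.append(dict(state))
    return history
```

With four nodes, f = 1, inputs 0.0, 1.0 and 2.0 at the fault-free nodes 1, 2 and 3, and a faulty node 4 that sends 100.0 to odd-numbered receivers and -100.0 to even-numbered ones, the extreme values are always trimmed. The smallest fault-free state never decreases, the largest never increases, and the gap between them shrinks toward zero over successive rounds.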


7  Extension  of  Above  Results 


In  this  paper,  we  analyzed  IABC  Algorithm  1  designed  for  synchronous  systems.  Similar 
analysis  also  applies  for  IABC  Algorithm  2  presented  in  [16]  for  asynchronous  systems. 

The  analysis  will  also  naturally  extend  to  an  IABC  algorithm  for  the  partially  synchronous 
algorithmic  model  presented  in  [4],  which  assumes  a  bounded  delay  in  propagation  of  state 
between  neighbors,  and  a  bounded  delay  between  consecutive  state  updates  at  each  node 
in  the  network.  The  generalization  of  Algorithm  1  to  the  partially  synchronous  algorithmic 
model will allow a node $i$, if performing state update in iteration $t$, to form vector $r_i[t]$
using the most recent known states of its incoming neighbors; these states of the neighbors
may correspond to any of the prior $B$ iterations, for some bounded $B$. A similar IABC
algorithm can also be used in time-varying network topologies (i.e., networks wherein the
set of links available in iteration $t$ varies with $t$); the above analysis will then extend to
time-varying topologies as well, with the algorithm performing correctly so long as the connectivity
matrices for the graphs at different $t$ jointly satisfy some reasonable properties, as in [9, 3, 17].


8  Summary 


We  presented  a  proof  of  validity  and  convergence  of  Algorithm  1  by  expressing  the  algorithm 
in  the  matrix  form.  The  main  contribution  of  the  paper  is  to  express  the  algorithm  in 
matrix  form  that  allows  us  to  prove  its  convergence  under  certain  necessary  conditions  on 
the  underlying  communication  graph.  Thus,  the  proof  implies  that  the  necessary  conditions 
are  also  sufficient.  The  key  to  the  proof  is  to  “design”  the  transition  matrix  for  each  iteration 
such  that  each  row  of  the  matrix  contains  a  large  enough  number  of  elements  that  are 
bounded  away  from  0. 